Abstract

We show that the Eigen model and the asexual Wright-Fisher model can be obtained as
different limit cases of a single stochastic model. This derivation makes clear the
exact differences between these two models.

The two key concepts introduced with the Eigen model, the error threshold and the
quasispecies, are not affected by these differences, so that they are naturally present also in
population genetics models. Accordingly, in the last part of the paper, we use the classical diploid
mutation-selection equation and the single peak fitness approximation to obtain the error
threshold for sexual diploids. Finally, we compare the results with the asexual case.

1 Introduction

The Eigen model was formulated as a deterministic mutation-selection model describing replication
at the onset of life. The study of mutation-selection balance in the Eigen model for very high
mutation rates led to the development of two new evolutionary concepts: the error threshold and
the quasispecies [5]. The first refers to the fact that, for a critical value of the mutation probability
(and for some choices of the fitness landscape, see [2], [IT], [H]), there is an abrupt transition in
the asymptotic state of the system from a cloud of mutants organized around a given consensus
sequence to an almost random distribution of genotypes. The second refers to the fact that, due to
the mutational coupling, selection acts on groups of neighbouring mutants (called quasispecies) instead
of on individuals.

Since RNA viruses lack proofreading mechanisms, they have mutation rates orders of magnitude
higher than DNA-based organisms, so that both the error threshold and the quasispecies concepts
could play a relevant role for these organisms. Indeed, the Eigen model has become the main
mathematical tool in this context (see [10], [11] for recent reviews on the subject). However, some authors
have questioned the relevance of the quasispecies as a paradigm for populations of RNA viruses [12], [9],
[7], [8], suggesting that the high heterogeneity in populations of RNA viruses could be due to genetic
drift, and consequently could be better explained by population genetics models. This contrast could
lead to the idea that the Eigen model and population genetics models are incompatible mathematical
models. For example, [6] begins by saying: "Some major differences distinguish quasispecies theory
from the classical selection theories of Darwin and neo-Darwinian geneticists", while in [10] one
can read: "The evolutionary dynamics of RNA viruses are complex and their high mutation rates,
rapid replication kinetics, and large population sizes present a challenge to traditional population
genetics". On the other hand, Wilke provided evidence that this is not the case, by showing that
particular limit cases of the Eigen model give rise to some well known population genetics equations
[17]. However, the precise mathematical relation between the Eigen model and population genetics
models remains unclear. The main purpose of the present paper is to fill this gap, by showing
that the Eigen model and the haploid asexual Wright-Fisher model can be obtained as different
particular limits of a single discrete time stochastic model. This is done in section 2. Motivated
by the analogies between the Eigen model and the Wright-Fisher model, in section 3 we use the
classical diploid mutation-selection equation and the single peak fitness landscape approximation
to determine the error threshold for sexual diploids. Finally, to determine the influence of syngamy
on the error threshold, in section 4 we compare the results that we obtained for sexual diploids with
those holding for asexual diploids.

2 The stochastic model 

In [13] it was shown that the Eigen model emerges as the deterministic and continuous time limit of
a stochastic mutation-selection model. By changing the selection procedure of that model, we can
obtain another stochastic model having again the Eigen model as its deterministic and continuous
time limit and, at the same time, the asexual Wright-Fisher model as a different particular subcase.






Let us consider a population of constant size N, whose individuals belong to K possible different types
$I_1, \ldots, I_K$ that reproduce asexually. Let $A_i$ be the fecundity of type $I_i$, $D_i$ its degradation rate and
$Q_{ij}$ the probability that an individual of type $I_j$ mutates into type $I_i$ as a result of an inexact
replication, with

$$ \sum_{i=1}^{K} Q_{ij} = 1. \tag{1} $$

In this model selection happens at discrete time steps of length h. Between a selection event and the next
one, the organisms of the different types $I_j$ reproduce, mutate and degrade at their characteristic
rates. We assume that only the individuals in the parental generation are subject to degradation,
while the newborns will always reach the next selection step, which will restore the total population
to N.

Let us denote by $n = (n_1, \ldots, n_K)$, with $\sum_{j=1}^{K} n_j = N$, the type counts just after a selection event.
Then, according to the above hypotheses, the expected number $m_i$ of individuals of type $I_i$ just before
the next selection event will be given by

$$ m_i = n_i + h \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) n_j. \tag{2} $$

All the quantities appearing in the above equation, with the exception of the $n_i$, which are integers,
can assume real values. Indeed, equation (2) should be interpreted as the deterministic
limit of a stochastic process (see also [H]). Notice that, since in (2) the quantity $h D_i n_i$ represents
the number of $I_i$ individuals in the parental generation that die before the next selection event, the
time step length h is bounded by the conditions $h D_i \le 1$, $i = 1, \ldots, K$.

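Selection consists in the extraction with replacement of N individuals from the population with
type counts $m = (m_1, \ldots, m_K)$, with sampling probabilities $\psi_i(n)$ equal to their relative frequencies

$$ \psi_i(n) := \frac{m_i}{\sum_{j=1}^{K} m_j} = \frac{n_i + h \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) n_j}{N + h \sum_{j=1}^{K} (A_j - D_j) n_j}. \tag{3} $$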



Notice that, by definition, $\sum_i \psi_i(n) = 1$ and that $\psi_i(n) \ge 0$ is granted by the conditions $h D_i \le 1$, so
that the interpretation of the $\psi_i(n)$ as probabilities is adequate.
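
As an illustration of the selection step just described, the following Python sketch computes the expected counts (2) and the sampling probabilities (3), and then draws the next generation by sampling N individuals with replacement. The function names, the two-type parameter values and the use of `random.choices` from the standard library are assumptions made for this example only; they are not part of the original model description.

```python
import random

def expected_counts(n, A, D, Q, h):
    """Expected counts m_i just before selection, following eq. (2)."""
    K = len(n)
    return [
        n[i] + h * (sum(A[j] * Q[i][j] * n[j] for j in range(K)) - D[i] * n[i])
        for i in range(K)
    ]

def sampling_probabilities(n, A, D, Q, h):
    """Relative frequencies psi_i(n) of eq. (3)."""
    m = expected_counts(n, A, D, Q, h)
    total = sum(m)
    return [m_i / total for m_i in m]

def selection_step(n, A, D, Q, h, rng):
    """Draw N individuals with replacement, with probabilities psi_i(n)."""
    psi = sampling_probabilities(n, A, D, Q, h)
    draws = rng.choices(range(len(n)), weights=psi, k=sum(n))
    return [draws.count(i) for i in range(len(n))]

# Hypothetical two-type example; Q[i][j] is the probability that type j yields type i.
A = [2.0, 1.0]
D = [1.0, 1.0]
Q = [[0.9, 0.0],
     [0.1, 1.0]]
n = [60, 40]
psi = sampling_probabilities(n, A, D, Q, h=0.5)
n_next = selection_step(n, A, D, Q, h=0.5, rng=random.Random(0))
```

Iterating `selection_step` gives one realization of the Markov chain; the total population size is restored to N at every step by construction.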
The Markov matrix of the model will be given by

$$ P_h(n'|n) = \frac{N!}{n'_1! \cdots n'_K!} \, \psi_1(n)^{n'_1} \cdots \psi_K(n)^{n'_K}, \tag{4} $$

where $n = (n_1, \ldots, n_K)$ and $n' = (n'_1, \ldots, n'_K)$ are the type counts in successive generations, with
$\sum_j n_j = \sum_j n'_j = N$.

Through (3) and (4), we have defined a family of stochastic models parametrized by the time
step h. The particular case when

$$ h D_j = 1, \quad \forall j, \tag{5} $$

corresponds to the case of separated generations, when all the individuals in the parental generation
die before the next reproductive step. When (5) holds, the sampling probabilities (3) simplify to

$$ \psi_i(n) = \frac{\sum_{j=1}^{K} A_j Q_{ij} n_j}{\sum_{j=1}^{K} A_j n_j}. \tag{6} $$

In this case, equation (4) defines the asexual Wright-Fisher model (see, for example, [1]), with the
viability of type $I_i$ given by $A_i$. Notice also that the model is now independent of the time step h.
We conclude that the stochastic model defined by (4) reduces to the asexual haploid Wright-Fisher
model when the generations are separated.
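
The reduction to the Wright-Fisher sampling probabilities can be checked numerically. The sketch below evaluates the general probabilities (3) with the degradation rates chosen so that $h D_j = 1$, and compares them with the ratio of mutated offspring numbers weighted by the fecundities $A_j$. The three-type parameters are hypothetical values chosen for illustration.

```python
def psi_general(n, A, D, Q, h):
    """Sampling probabilities of eq. (3) for an arbitrary time step h."""
    K = len(n)
    m = [n[i] + h * (sum(A[j] * Q[i][j] * n[j] for j in range(K)) - D[i] * n[i])
         for i in range(K)]
    total = sum(m)
    return [m_i / total for m_i in m]

def psi_wright_fisher(n, A, Q):
    """Sampling probabilities obtained after imposing h * D_j = 1."""
    K = len(n)
    offspring = [sum(A[j] * Q[i][j] * n[j] for j in range(K)) for i in range(K)]
    total = sum(A[j] * n[j] for j in range(K))
    return [o / total for o in offspring]

# Hypothetical parameters; each column of Q sums to one.
h = 0.25
A = [3.0, 2.0, 1.0]
D = [1.0 / h] * 3          # separated generations: h * D_j = 1
Q = [[0.8, 0.1, 0.0],
     [0.2, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
n = [10, 20, 30]

general = psi_general(n, A, D, Q, h)
wright_fisher = psi_wright_fisher(n, A, Q)
```

With these values the two lists agree up to rounding, and changing `h` (with `D` rescaled accordingly) leaves them unchanged, in line with the remark that the model no longer depends on the time step.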

Let us now consider the deterministic limit of the model (4). First of all, let us notice that the
probability that $n'_i = k$ is simply given by

$$ P_h(n'_i = k \,|\, n) = \binom{N}{k} \psi_i(n)^k \, (1 - \psi_i(n))^{N-k}. \tag{7} $$

Accordingly, the expected value $\bar{n}'_i$ of $n'_i$ will be given by

$$ \bar{n}'_i = \sum_{k=0}^{N} k \binom{N}{k} \psi_i(n)^k \, (1 - \psi_i(n))^{N-k}. \tag{8} $$



Using the binomial identity

$$ \sum_{k=0}^{N} k \binom{N}{k} x^k y^{N-k} = N x (x + y)^{N-1}, \tag{9} $$

we get

$$ \bar{n}'_i = N \psi_i(n). \tag{10} $$
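
The step from the binomial sum to $N \psi_i(n)$ can be verified numerically for a small population. The sketch below evaluates the sum directly with `math.comb`; the values of `N` and `psi` are arbitrary illustrative choices.

```python
from math import comb

def expected_count(N, psi):
    """Mean of a binomial(N, psi) variable, computed term by term."""
    return sum(k * comb(N, k) * psi**k * (1 - psi)**(N - k) for k in range(N + 1))

mean = expected_count(50, 0.3)   # should agree with N * psi = 15 up to rounding
```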

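By substituting the expression for the sampling probability (3), with n replaced by its expected value
$\bar{n}$, inside (10), we get the following system of discrete equations

$$ \bar{n}'_i = \frac{\bar{n}_i + h \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) \bar{n}_j}{1 + \frac{h}{N} \sum_{j=1}^{K} (A_j - D_j) \bar{n}_j}. \tag{11} $$

Dividing equation (11) by N we obtain the equations for the type frequencies $\phi_i = \bar{n}_i / N$:

$$ \phi'_i = \frac{\phi_i + h \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) \phi_j}{1 + h \sum_{j=1}^{K} (A_j - D_j) \phi_j}. \tag{12} $$

For $h \to 0$, we have the following asymptotic expansion

$$ \phi'_i = \phi_i + h \left[ \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) \phi_j - \phi_i \sum_{j=1}^{K} (A_j - D_j) \phi_j \right] + O(h^2), \tag{13} $$

from which it follows

$$ \frac{\phi'_i - \phi_i}{h} = \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) \phi_j - \phi_i \sum_{j=1}^{K} (A_j - D_j) \phi_j + O(h). \tag{14} $$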



The Eigen model equations are obtained by taking the limit $h \to 0$ of the above expression:

$$ \dot{\phi}_i = \sum_{j=1}^{K} (A_j Q_{ij} - D_i \delta_{ij}) \phi_j - \phi_i \sum_{j=1}^{K} (A_j - D_j) \phi_j. \tag{15} $$
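
The continuous time limit can be illustrated numerically: for a small time step, the increment of the discrete frequency map divided by h should be close to the right hand side of the Eigen equations. The following sketch implements both expressions; the function names and the three-type parameters are assumptions made for the example.

```python
def mean_excess(phi, A, D):
    """Average of A_j - D_j over the population."""
    return sum((A[j] - D[j]) * phi[j] for j in range(len(phi)))

def step_frequencies(phi, A, D, Q, h):
    """One step of the deterministic frequency map with time step h."""
    K = len(phi)
    norm = 1.0 + h * mean_excess(phi, A, D)
    return [
        (phi[i] + h * (sum(A[j] * Q[i][j] * phi[j] for j in range(K)) - D[i] * phi[i])) / norm
        for i in range(K)
    ]

def eigen_rhs(phi, A, D, Q):
    """Right hand side of the Eigen equations."""
    K = len(phi)
    excess = mean_excess(phi, A, D)
    return [
        sum(A[j] * Q[i][j] * phi[j] for j in range(K)) - D[i] * phi[i] - phi[i] * excess
        for i in range(K)
    ]

# Hypothetical parameters; each column of Q sums to one.
A = [2.0, 1.0, 1.0]
D = [0.5, 0.5, 0.5]
Q = [[0.8, 0.1, 0.0],
     [0.2, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
phi = [0.5, 0.3, 0.2]
h = 1e-6

stepped = step_frequencies(phi, A, D, Q, h)
finite_difference = [(stepped[i] - phi[i]) / h for i in range(3)]
rhs = eigen_rhs(phi, A, D, Q)
```

The residual between `finite_difference` and `rhs` is of order h, as expected from the asymptotic expansion.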



Clearly, by imposing separated generations (5) into equation (11), written in terms of the type
frequencies, we would obtain the deterministic limit of the haploid asexual Wright-Fisher model, which
coincides with the classical haploid mutation-selection model (see, for example, [5]):

$$ \phi'_i = \frac{\sum_{j=1}^{K} A_j Q_{ij} \phi_j}{\sum_{j=1}^{K} A_j \phi_j}. \tag{16} $$



So, the only difference between the classical haploid mutation-selection model (16) and the Eigen
model (15) is that the former is obtained from the deterministic model (11) by imposing separated
generations, while the latter is obtained by taking the continuous time limit. Since neither the error
threshold nor the quasispecies phenomenon is due to this difference, they are naturally present in both models
(indeed, see [3] for the quasispecies and [16] for the error threshold in the context of population
genetics).

Just to give a concrete example, let us compute the error threshold according to both models in
a very special case that allows for a simple analytical treatment. Let us suppose that each type $I_j$
is specified by a genotype of length L (so that $K = 4^L$). Let u be the point mutation probability
and let us send $u \to 0$ and $L \to \infty$ in such a way that the genomic mutation rate $U = uL$ stays finite.
In this limit, the probability of mutation from the type $I_1$ to a different type $I_j$, $j \ne 1$, will be given
by $\mu = 1 - \exp(-U)$ and the probability of back mutation will be zero. Let us also consider the single
peak fitness landscape $A_i = A_2$, $D_i = D_2$, $i > 2$, $A_1 - D_1 > A_2 - D_2$. The single peak fitness landscape
is a (very) simplified fitness landscape often used in the Eigen model to get an analytical expression
for the error threshold (see [2]). Let $\phi_B$ be the total frequency of all the sequences different from $I_1$:

$$ \phi_B = \sum_{i=2}^{K} \phi_i, \tag{17} $$

then the Eigen equations reduce to only two equations:

$$ \dot{\phi}_1 = (A_1 e^{-U} - D_1) \phi_1 - \phi_1 \left[ (A_1 - D_1) \phi_1 + (A_2 - D_2) \phi_B \right], \tag{18} $$

$$ \dot{\phi}_B = A_1 (1 - e^{-U}) \phi_1 + (A_2 - D_2) \phi_B - \phi_B \left[ (A_1 - D_1) \phi_1 + (A_2 - D_2) \phi_B \right]. \tag{19} $$

The error threshold corresponds to the smallest value of the genomic mutation rate U for which

$$ \lim_{t \to \infty} \phi_1(t) = 0. \tag{20} $$

From equations (18), (19), we get the error threshold

$$ U_t = \ln \left( \frac{A_1}{A_2 + D_1 - D_2} \right). \tag{21} $$
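
A quick numerical check of this threshold is sketched below. It integrates the equation for the master frequency with a plain forward Euler scheme, using the fact that the two frequencies sum to one, and compares the long time behaviour below and above the threshold $U_t = \ln(A_1 / (A_2 + D_1 - D_2))$. The integration scheme, step size and parameter values are assumptions chosen for illustration.

```python
import math

def eigen_threshold(A1, D1, A2, D2):
    """Critical genomic mutation rate in the single peak landscape."""
    return math.log(A1 / (A2 + D1 - D2))

def master_frequency(U, A1, D1, A2, D2, phi1=1.0, dt=0.01, steps=20000):
    """Forward Euler integration of the master sequence frequency."""
    for _ in range(steps):
        phi_b = 1.0 - phi1                      # frequency of all other sequences
        mean = (A1 - D1) * phi1 + (A2 - D2) * phi_b
        phi1 += dt * ((A1 * math.exp(-U) - D1) * phi1 - phi1 * mean)
    return phi1

# Hypothetical rates for the master type (1) and for all other types (2).
A1, D1, A2, D2 = 2.0, 0.5, 1.0, 0.5
U_t = eigen_threshold(A1, D1, A2, D2)            # ln(2) for these values
below = master_frequency(0.3, A1, D1, A2, D2)    # U < U_t: master sequence persists
above = master_frequency(1.0, A1, D1, A2, D2)    # U > U_t: master sequence is lost
```

Below the threshold the master frequency settles at the positive stationary value obtained by setting the right hand side to zero; above it, the frequency decays to zero.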

For the classical haploid mutation-selection model (16) the above assumptions translate into
considering a locus with two alleles of relative viability $w_1 = A_1 > A_2 = w_2$, with forward mutation
probability given by $\mu = 1 - \exp(-U)$ and the probability of back mutation being zero. The frequencies
$\phi'_1$ and $\phi'_2$ of the two alleles at the next generation, given that they are $\phi_1$ and $\phi_2$ at the
present one, will be:

$$ \phi'_1 = \frac{(1 - \mu) A_1 \phi_1}{\phi_1 A_1 + \phi_2 A_2}, \tag{22} $$

$$ \phi'_2 = \frac{\mu A_1 \phi_1 + A_2 \phi_2}{\phi_1 A_1 + \phi_2 A_2}. \tag{23} $$

The first allele will go extinct, in the asymptotic limit, when $\phi'_1 - \phi_1 < 0$ for $\phi_1 > 0$, which implies

$$ \mu > \frac{A_1 - A_2}{A_1}, \tag{24} $$

or

$$ U_t = \ln \left( \frac{A_1}{A_2} \right). \tag{25} $$

This is equivalent to the Eigen model result, taking into account that the separated generations
condition (5) implies that $D_1 = D_2$. By rescaling $A_1$ to 1 and $A_2$ to $1 - s$, where s is the selection
coefficient, we obtain:

$$ U_t = \ln \left( \frac{1}{1 - s} \right). \tag{26} $$

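
The haploid threshold can be illustrated by iterating the two-allele recursion directly. The sketch below uses the rescaled viabilities 1 and 1 - s and the forward mutation probability 1 - exp(-U); the number of generations and the value of s are arbitrary choices made for the example.

```python
import math

def haploid_master_frequency(U, s, generations=5000, phi1=1.0):
    """Iterate the two-allele haploid mutation-selection recursion."""
    mu = 1.0 - math.exp(-U)        # forward mutation probability, no back mutation
    A1, A2 = 1.0, 1.0 - s          # rescaled viabilities
    for _ in range(generations):
        phi2 = 1.0 - phi1
        phi1 = (1.0 - mu) * A1 * phi1 / (phi1 * A1 + phi2 * A2)
    return phi1

s = 0.5
U_t = math.log(1.0 / (1.0 - s))                  # threshold for the rescaled model
below = haploid_master_frequency(0.3, s)         # U < U_t
above = haploid_master_frequency(1.0, s)         # U > U_t
```

Below the threshold the iteration converges to the positive fixed point (s - mu) / s, which follows from setting the new frequency equal to the old one; above the threshold the fittest allele is lost.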
3 Error threshold in the sexual diploid Wright-Fisher model 

Given the relation between the Eigen model (15) and the classical haploid mutation-selection model
(16), it seems a natural option to use the classical diploid mutation-selection equation to determine
the error threshold for sexual diploids. This was indeed done in [16], but the analytical derivation
of the error threshold was inaccurate, giving the correct value of the critical mutation probability
only for some regions of the (h, s) space, where h is the dominance and s the selection parameter.
To explain the problem with the derivation given in [16], let us briefly recall it. To obtain an
analytic expression for the error threshold, the authors considered a diploid analogue of the simplifying
assumptions that we used in the previous section. Namely, they considered a single locus with two
alleles, with mutation probability from the fittest to the worst allele given by $1 - m_{11}$ and vanishing
back mutation probability (we recall that these simplifying assumptions come from considering
a genome of infinite length in the single peak landscape, see the previous section). Using these
assumptions, the continuous time version of the classical diploid mutation-selection equation for the
master frequency $x_1$ reduces to (compare with eq. (1) in [16]):

$$ \dot{x}_1 = x_1 (Wx)_1 m_{11} - x_1 (x, Wx), \tag{27} $$

where (in their notation) $x = (x_1, 1 - x_1)$ is the vector of frequencies, W the viability matrix and
$(1 - m_{11})$ is the mutation probability of the master sequence. Next, they looked for a stationary
solution of equation (27) (see eq. (17) in [16]):

$$ x_1 (Wx)_1 m_{11} - x_1 (x, Wx) = 0. \tag{28} $$

When solving equation (28) for $m_{11}$, they neglected the common $x_1$ factor (see eq. (18) in [16]).
That is, they solved the equation

$$ (Wx)_1 m_{11} - (x, Wx) = 0 \tag{29} $$

for $m_{11}$ in the case $x_1 = 0$. However, this procedure is incomplete for two reasons. First, it should
be checked that for the obtained value of $m_{11}$ there are no more solutions of equation (29) for
$x_1 \in (0, 1]$, because if there were, $x_1 = 0$ would not be a global sink for equation (27). Second, if
$m_{11}$ is such that the left hand side of equation (29) is not zero but always negative, $x_1 = 0$ will be a
sink for equation (27) and the corresponding value of $m_{11}$ a candidate for the error threshold even
if equation (29) is not satisfied for $x_1 = 0$. So, in the following we will determine the error threshold
by taking into account the above considerations.

For the purpose of determining the error threshold, it is equivalent to consider discrete
or continuous time, so we will use the more traditional discrete time version of the classical diploid
mutation-selection equation. As usual, we will consider two alleles at an autosomal locus in a
monoecious random mating population with separated generations. Let us denote by A the fittest
allele and by p its frequency after the mutation step but before selection, while the frequency of the
other allele a will be given by $1 - p$. Let the relative fitness be given by 1 for AA, $1 - hs$ for Aa
and $1 - s$ for aa. We will denote by $\mu$ the mutation probability from A to a and, using the same
approximation that we considered in the haploid case, we will set the back mutation probability to
zero. Furthermore, we will restrict our considerations to the case $0 \le h \le 1$, that is, we will
neglect underdominance and overdominance. The frequency $p'$ of the A allele after a generation
(composed of selection followed by mutation) will be given by (see [3]):

$$ p' = \frac{(1 - \mu) \left[ p^2 + p (1 - p)(1 - hs) \right]}{\bar{w}}, \tag{30} $$

where $\bar{w}$ is the average fitness:

$$ \bar{w} = p^2 + 2 p (1 - p)(1 - hs) + (1 - p)^2 (1 - s). \tag{31} $$

We want to determine the minimum value of the mutation probability $\mu$ that determines the
extinction of the fittest allele A from the population, given that its initial frequency is $p_0 = 1$. So, we need
to find the minimum value of $\mu$ that implies $p' < p$ for any p. Since $\bar{w} > 0$ this gives the inequality:

$$ p (1 - \mu) \left[ p + (1 - p)(1 - hs) \right] - \bar{w} p < 0. \tag{32} $$

By eliminating the common p factor we reduce to the quadratic inequality in p:

$$ a p^2 + b p + c < 0, \quad p \in (0, 1], \tag{33} $$

with

$$ a = s (1 - 2h), \tag{34} $$

$$ b = hs (1 - \mu) - 2 s (1 - h), \tag{35} $$

$$ c = (1 - hs)(1 - \mu) - (1 - s). \tag{36} $$
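
The reduction to a quadratic can be checked numerically. The sketch below compares the inequality's left hand side, after the common factor p has been divided out, with the quadratic built from the three coefficients, over a grid of hypothetical parameter values.

```python
def reduced_lhs(p, mu, h, s):
    """Left hand side of the inequality after dividing out the common factor p."""
    wbar = p * p + 2 * p * (1 - p) * (1 - h * s) + (1 - p) ** 2 * (1 - s)
    return (1 - mu) * (p + (1 - p) * (1 - h * s)) - wbar

def quadratic(p, mu, h, s):
    """a p^2 + b p + c with the coefficients given in the text."""
    a = s * (1 - 2 * h)
    b = h * s * (1 - mu) - 2 * s * (1 - h)
    c = (1 - h * s) * (1 - mu) - (1 - s)
    return a * p * p + b * p + c

# Largest discrepancy over a coarse grid of illustrative parameter values.
grid = [k / 10 for k in range(1, 10)]
max_gap = max(
    abs(reduced_lhs(p, mu, h, s) - quadratic(p, mu, h, s))
    for p in grid for mu in grid for h in grid for s in grid
)
```

The two expressions agree up to floating point rounding, so studying the sign of the quadratic on (0, 1] is equivalent to studying the sign of $p' - p$.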

For $h < 1/2$, both the coefficient a (34) and the discriminant $\Delta = b^2 - 4ac$ are positive. In this case,
the parabola $a p^2 + b p + c$ will always have two real roots and will be negative in the region between
the roots. One of the roots will be zero when $c = 0$, that is when

$$ \mu = \frac{(1 - h) s}{1 - hs}. \tag{37} $$

For this value of $\mu$ the second root becomes

$$ p = \frac{3h - 2 + hs (1 - 2h)}{(1 - hs)(2h - 1)}. \tag{38} $$



This last quantity will be greater than 1 for $h < 1$, which is always satisfied here. We conclude that, for
$h < 1/2$, the error threshold is given by equation (37). In the case $h = 1/2$ equation (37) remains
valid by continuity. Alternatively, since $a = 0$ one can directly solve the inequality

$$ b p + c < 0, \tag{39} $$

which implies

$$ \mu > \frac{s/2}{1 - s/2}. \tag{40} $$

When $h > 1/2$, $a < 0$ and the discriminant can be either positive or negative. The inequality (33)
will be satisfied if one of these three conditions is satisfied:

1. $\Delta > 0$ and the largest root is less than or equal to zero,

2. $\Delta > 0$ and the smallest root is greater than or equal to one,

3. $\Delta < 0$.



Equation (37) implies that one of the roots is zero, so that the first condition is satisfied if the second
root (38) is less than or equal to zero. This is the case if

$$ s \ge \frac{3h - 2}{h (2h - 1)}. \tag{41} $$

Notice that the right hand side of (41) is less than zero when $1/2 < h < 2/3$ and it is greater than
zero but less than one when $2/3 < h < 1$.

The second condition can never be satisfied. Indeed, a necessary condition for the smallest root
of a parabola to be greater than one is that the abscissa of the vertex, $-b/(2a)$, also be
greater than one. In our case this translates into the condition

$$ -hs (\mu + 1) > 0, \tag{42} $$

which cannot be satisfied for our choice of the range of the parameters.

Regarding the third condition, there exist real solutions $\mu$ to $\Delta < 0$ only when

$$ s < \frac{2h - 1}{h^2}. \tag{43} $$

Since it holds

$$ \frac{2h - 1}{h^2} \ge \frac{3h - 2}{h (2h - 1)}, \quad \frac{1}{2} < h \le 1, \tag{44} $$

the two regions (41) and (43) cover all the region $0 < s < 1$, $1/2 < h \le 1$. Solving $\Delta < 0$ and imposing
$\mu < 1$, we get the solution:

$$ \mu > \frac{2 (2h - 1) - h^2 s - 2 \sqrt{(1 - 2h)(1 - 2h + h^2 s)}}{h^2 s}, \quad \frac{1}{2} < h \le 1, \quad 0 < s < \frac{2h - 1}{h^2}. \tag{45} $$



In the region

$$ \max \left( 0, \frac{3h - 2}{h (2h - 1)} \right) \le s < \frac{2h - 1}{h^2}, \quad \frac{1}{2} < h \le 1, \tag{46} $$

we have the two possible solutions for the error threshold:

$$ \mu_1 = \frac{(1 - h) s}{1 - hs}, \tag{47} $$

$$ \mu_2 = \frac{2 (2h - 1) - h^2 s - 2 \sqrt{(1 - 2h)(1 - 2h + h^2 s)}}{h^2 s}. \tag{48} $$

The two solutions $\mu_1$ and $\mu_2$ have the same value on the curve

$$ \mu_1 = \mu_2 \quad \text{for} \quad s = \frac{3h - 2}{h (2h - 1)}. \tag{49} $$



Notice that the curve (49) assumes positive values of s only when $h > 2/3$. Since a continuous
solution for the error threshold must exist in the entire region $0 \le h \le 1$, $0 < s < 1$, we conclude that
the error threshold will be given by:

$$ \mu_{sex} = \begin{cases} \mu_1, & 0 \le h \le \frac{2}{3},\ 0 < s < 1 \quad \text{or} \quad \frac{2}{3} < h \le 1,\ \max \left( 0, \frac{3h - 2}{h (2h - 1)} \right) \le s < 1, \\ \mu_2, & \frac{2}{3} < h \le 1,\ s < \frac{3h - 2}{h (2h - 1)}. \end{cases} \tag{50} $$

The result given in [16] coincides with the first line of equation (50).
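
The piecewise threshold can be cross-checked against a brute-force computation. The sketch below implements the closed form and, independently, finds by bisection the smallest mutation probability for which the diploid recursion gives $p' \le p$ at every point of a grid on (0, 1]. The grid resolution, the bisection tolerance and the test points are assumptions made for this illustration; the grid makes the numerical value accurate only to about 1e-3.

```python
import math

def mu_sex(h, s):
    """Closed-form sexual diploid error threshold (piecewise in h and s)."""
    mu1 = (1 - h) * s / (1 - h * s)
    if h <= 2 / 3 or s >= (3 * h - 2) / (h * (2 * h - 1)):
        return mu1
    root = math.sqrt((1 - 2 * h) * (1 - 2 * h + h * h * s))
    return (2 * (2 * h - 1) - h * h * s - 2 * root) / (h * h * s)

def declines_everywhere(mu, h, s, grid=5000):
    """True if one generation never increases the frequency of A on the grid."""
    for k in range(1, grid + 1):
        p = k / grid
        wbar = p * p + 2 * p * (1 - p) * (1 - h * s) + (1 - p) ** 2 * (1 - s)
        p_next = (1 - mu) * (p * p + p * (1 - p) * (1 - h * s)) / wbar
        if p_next > p:
            return False
    return True

def mu_sex_numeric(h, s, tol=1e-6):
    """Smallest mu with p' <= p on the whole grid, located by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if declines_everywhere(mid, h, s):
            hi = mid
        else:
            lo = mid
    return hi

# One test point in each regime: h < 1/2, 1/2 < h <= 2/3, and the mu_2 region.
points = [(0.3, 0.5), (0.6, 0.5), (0.9, 0.3)]
gaps = [abs(mu_sex(h, s) - mu_sex_numeric(h, s)) for h, s in points]
```

Bisection is legitimate here because the quadratic decreases monotonically with the mutation probability at every fixed p, so the set of admissible values is an interval ending at 1.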



4 The asexual diploids case 

It is interesting to compare the result for the error threshold of sexual diploids (50) obtained in
the previous section with that of asexual diploids, to evaluate the effect of syngamy on the error
threshold. To this aim, we now calculate the error threshold for an asexual diploid organism, in
the usual approximation of infinite genome length and single peak fitness landscape. Consider a
diploid locus in an asexual organism, and denote by $p_1$, $p_2$ and $p_3$ respectively the frequencies of
the AA, Aa and aa genotypes and by $\mu$ the probability of mutating from A to a. Then, under the
above hypotheses, we have that after one generation:

$$ p' = M p, \tag{51} $$

with M given by

$$ M = \begin{pmatrix} (1 - \mu)^2 & 0 & 0 \\ 2 \mu (1 - \mu)(1 - hs) & (1 - \mu)(1 - hs) & 0 \\ \mu^2 (1 - s) & \mu (1 - s) & 1 - s \end{pmatrix}. \tag{52} $$

Since the equations are linear, there is no need of normalizing. The asymptotic frequencies will be
given by $p_i^{(\infty)} / (p_1^{(\infty)} + p_2^{(\infty)} + p_3^{(\infty)})$. The outcome will depend on which is the maximum eigenvalue of
the matrix M. If the maximum eigenvalue is $(1 - \mu)^2$, then the three genotypes will coexist, because
the corresponding eigenvector of M has its three components different from zero. If the maximum
eigenvalue is $(1 - \mu)(1 - hs)$, then the AA homozygote will disappear and the other two will coexist,
because the corresponding eigenvector of M has the first component zero and the other two different
from zero. Finally, if the maximum eigenvalue is $1 - s$, then only the homozygote aa will survive,
because the corresponding eigenvector of M has only its third component different from zero. The
first case occurs when

$$ \mu < \min \left( hs, 1 - \sqrt{1 - s} \right). \tag{53} $$



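The second case occurs when

$$ \frac{(1 - h) s}{1 - hs} > \mu > hs. \tag{54} $$

Notice that the condition

$$ \frac{(1 - h) s}{1 - hs} > hs \tag{55} $$

implies

$$ s > \frac{2h - 1}{h^2}. \tag{56} $$

Accordingly, the threshold mutation rate for the loss of the homozygote AA is given by

$$ \mu_{p_1 = 0} = hs, \quad \max \left( 0, \frac{2h - 1}{h^2} \right) < s < 1. \tag{57} $$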
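
The three regimes determined by the largest eigenvalue of M can be observed by iterating the linear map and normalizing at each step. In the sketch below the matrix entries are written out explicitly; the parameter values h = 0.3, s = 0.5 and the three mutation probabilities are hypothetical choices, one for each regime.

```python
def asymptotic_frequencies(mu, h, s, generations=2000):
    """Iterate p' = M p from a pure AA population, renormalizing every step."""
    p = [1.0, 0.0, 0.0]                      # frequencies of AA, Aa, aa
    for _ in range(generations):
        p1, p2, p3 = p
        q = [
            (1 - mu) ** 2 * p1,
            2 * mu * (1 - mu) * (1 - h * s) * p1 + (1 - mu) * (1 - h * s) * p2,
            mu ** 2 * (1 - s) * p1 + mu * (1 - s) * p2 + (1 - s) * p3,
        ]
        total = sum(q)
        p = [x / total for x in q]
    return p

h, s = 0.3, 0.5
coexistence = asymptotic_frequencies(0.1, h, s)   # mu < h*s: AA, Aa and aa coexist
aa_and_het = asymptotic_frequencies(0.3, h, s)    # h*s < mu < (1-h)s/(1-hs): AA is lost
only_aa = asymptotic_frequencies(0.5, h, s)       # larger mu: only aa survives
```

Normalizing at every step does not change the asymptotic proportions, since the map is linear; it only keeps the numbers in a convenient range.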

Finally, when

$$ \mu > \max \left( \frac{(1 - h) s}{1 - hs}, 1 - \sqrt{1 - s} \right) \tag{58} $$

there is the complete loss of the advantageous allele. Notice that

$$ \frac{(1 - h) s}{1 - hs} > 1 - \sqrt{1 - s} \iff h < \frac{1 - \sqrt{1 - s}}{s} \tag{59} $$

and that

$$ \frac{1}{2} < \frac{1 - \sqrt{1 - s}}{s} < 1, \quad 0 < s < 1. \tag{60} $$



So, the error threshold will be given by

$$ \mu_{p_1 = p_2 = 0} = \begin{cases} \dfrac{(1 - h) s}{1 - hs}, & h < \dfrac{1 - \sqrt{1 - s}}{s}, \\ 1 - \sqrt{1 - s}, & h \ge \dfrac{1 - \sqrt{1 - s}}{s}. \end{cases} \tag{61} $$

We can now compare the error thresholds in the sexual and asexual case. Comparing equations
(50) and (61), and taking into account (59) and (60), we see that for $h \le 1/2$ there is no difference in
the error threshold between the sexual and asexual case. We plot this difference in figure 1 for the
whole range of variation of h and s. We see that, for $h > 1/2$, the advantageous allele is more robust
to complete loss by mutation in the asexual than in the sexual case.

In figure 2 we show the difference between the sexual error threshold (50) and the threshold
mutation rate for the loss of the advantageous homozygote in the asexual case. We see that in
this case, as expected, the advantageous homozygote is much more robust to loss by mutation in the
sexual than in the asexual case. Indeed, in the sexual case the advantageous homozygote can be
eliminated only by completely removing the advantageous allele.
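
The comparison between the sexual and asexual thresholds can be reproduced at a few sample points with the closed-form expressions. The sketch below evaluates both thresholds for the complete loss of the advantageous allele, together with the asexual threshold hs for the loss of the AA homozygote; the sample values of h and s are arbitrary, and the functions are a transcription of the formulas under the reconstruction used in this text.

```python
import math

def mu_sex(h, s):
    """Sexual diploid error threshold (piecewise closed form)."""
    mu1 = (1 - h) * s / (1 - h * s)
    if h <= 2 / 3 or s >= (3 * h - 2) / (h * (2 * h - 1)):
        return mu1
    root = math.sqrt((1 - 2 * h) * (1 - 2 * h + h * h * s))
    return (2 * (2 * h - 1) - h * h * s - 2 * root) / (h * h * s)

def mu_asex(h, s):
    """Asexual diploid threshold for the complete loss of the A allele."""
    critical_h = (1 - math.sqrt(1 - s)) / s
    if h < critical_h:
        return (1 - h) * s / (1 - h * s)
    return 1 - math.sqrt(1 - s)

# For h <= 1/2 the two thresholds coincide.
equal_gaps = [abs(mu_asex(h, s) - mu_sex(h, s))
              for h in (0.1, 0.3, 0.5) for s in (0.2, 0.5, 0.8)]

# For these sample points with h > 1/2 the asexual threshold is the larger one.
allele_gaps = [mu_asex(h, s) - mu_sex(h, s)
               for h, s in ((0.6, 0.5), (0.7, 0.1), (0.9, 0.3))]

# Loss of the AA homozygote: sexual threshold versus the asexual value h*s.
homozygote_gap = mu_sex(0.3, 0.5) - 0.3 * 0.5
```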

5 Conclusions 

We constructed a stochastic model having the haploid Wright-Fisher model and the Eigen model as
particular subcases. The haploid Wright-Fisher model is obtained by considering separated
generations, while the Eigen model is obtained by taking the deterministic and continuous time limit. This
derivation makes clear the differences between these two important models of
mutation-selection dynamics. Emerging as a deterministic limit, the Eigen model neglects genetic drift and it
is almost equivalent to the deterministic limit of the haploid Wright-Fisher model, that is, the
classical haploid mutation-selection model (16). The differences between this model and the Eigen model
do not invalidate the concepts of quasispecies and error threshold, which consequently are present
in both models. This suggests using the classical diploid mutation-selection model to obtain the
error threshold for sexual diploids. We derived an analytical expression for the error threshold within
this model by using the usual approximations of infinite genome length and the single peak fitness
landscape. We compared this expression with the corresponding expression for asexual diploid
organisms. No difference emerges when $h \le 1/2$ but, curiously, when $h > 1/2$, syngamy makes the
advantageous allele more liable to complete loss by mutation. On the other hand, this is not the
case for the loss of the advantageous homozygote, which is more robust in the sexual case, especially
for low values of the dominance parameter h and high values of the selection coefficient, as can be
appreciated in figure 2.

Figure 1: The difference $\mu_{p_1 = p_2 = 0} - \mu_{sex}$ between the mutation thresholds for the complete loss
of the advantageous allele in the asexual case ($\mu_{p_1 = p_2 = 0}$, see eq. (61)) and in the sexual case ($\mu_{sex}$, see eq.
(50)), versus the dominance h and the selection coefficient s.

Figure 2: The difference $\mu_{sex} - \mu_{p_1 = 0}$ between the mutation thresholds for the loss of the
advantageous homozygote in the sexual case ($\mu_{sex}$, see eq. (50)) and in the asexual case ($\mu_{p_1 = 0}$, see eq.
(57)), versus the dominance h and the selection coefficient s.